Divide-and-Conquer Reinforcement Learning

Authors

Dibya Ghosh

Avi Singh

Aravind Rajeswaran

Vikash Kumar

Sergey Levine

Abstract

Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit considerable initial state variation typically produce high-variance gradient estimates for model-free RL, making direct policy or value function optimization challenging. In this paper, we develop a novel algorithm that instead partitions the initial state space into “slices”, and optimizes an ensemble of policies, each on a different slice. The ensemble is gradually unified into a single policy that can succeed on the whole state space. This approach, which we term divide-and-conquer RL, is able to solve complex tasks where conventional deep RL methods are ineffective. Our results show that divide-and-conquer RL greatly outperforms conventional policy gradient methods on challenging grasping, manipulation, and locomotion tasks, and exceeds the performance of a variety of prior methods. Videos of policies learned by our algorithm can be viewed at https://sites.google.com/view/dnc-rl/.
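
The partitioning idea can be illustrated with a small sketch. The abstract does not say how the slices are formed, so the nearest-center rule below is an assumption, and the 2-D states, the centers, and the helper names (`nearest_center`, `partition_into_slices`, `spread`) are hypothetical. The sketch only shows that initial states within a slice vary far less than initial states overall, which is the motivation given above; it does not train or unify any policies.

```python
import random

def nearest_center(state, centers):
    """Return the index of the center closest to `state` (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(state, c)) for c in centers]
    return dists.index(min(dists))

def partition_into_slices(states, centers):
    """Group initial states into one slice per center (assumed slicing rule)."""
    slices = [[] for _ in centers]
    for s in states:
        slices[nearest_center(s, centers)].append(s)
    return slices

def spread(states):
    """Mean squared distance of the states from their centroid."""
    dim = len(states[0])
    centroid = [sum(s[i] for s in states) / len(states) for i in range(dim)]
    return sum(
        sum((s[i] - centroid[i]) ** 2 for i in range(dim)) for s in states
    ) / len(states)

rng = random.Random(0)
# Hypothetical 2-D initial states drawn around two well-separated modes.
states = [(rng.gauss(-2.0, 0.3), rng.gauss(0.0, 0.3)) for _ in range(100)]
states += [(rng.gauss(2.0, 0.3), rng.gauss(0.0, 0.3)) for _ in range(100)]
centers = [(-2.0, 0.0), (2.0, 0.0)]  # assumed slice centers

slices = partition_into_slices(states, centers)
overall = spread(states)
per_slice = [spread(s) for s in slices]
print("overall spread:", overall)
print("per-slice spread:", per_slice)
```

In this toy setting each slice would get its own policy, trained only on start states from that slice, before the ensemble is merged into a single policy.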

Similar resources

Free Vibration Analysis of Repetitive Structures using Decomposition, and Divide-Conquer Methods

This paper consists of three sections. In the first section, an efficient method is used to decompose the canonical matrices associated with repetitive structures. To this end, a cylindrical coordinate system and a special numbering scheme are employed. In the second section, the divide-and-conquer method is used for the eigensolution of these structures, where the matrices are in ...

Divide-and-Conquer Methods for Solving MDPs

The Markov Decision Process (MDP) is the principal theoretical formalism in the area of Reinforcement Learning (RL). An import from optimal control in operations research, this construct is generic enough to represent problems comprising almost all of AI research, but consequently, it suffers from the curse of dimensionality where learning involves an exponential number of parameters. Researche...

A Greedy Divide-and-Conquer Approach to Optimizing Large Manufacturing Systems using Reinforcement Learning

Manufacturing is a challenging real-world domain for studying hierarchical MDP-based optimization algorithms. We have recently obtained very promising results using a hierarchical reinforcement learning based optimization algorithm for a 12-machine transfer line. Transfer lines model factory processes in automobile and many other product assembly plants. Unlike domains such as elevator scheduli...

The Divide-and-Conquer Manifesto

Existing machine learning theory and algorithms have focused on learning an unknown function from training examples, where the unknown function maps from a feature vector to one of a small number of classes. Emerging applications in science and industry require learning much more complex functions that map from complex input spaces (e.g., 2-dimensional maps, time series, and strings) to comple...

Divide and Conquer: Hierarchical Reinforcement Learning and Task Decomposition in Humans

The field of computational reinforcement learning (RL) has proved extremely useful in research on human and animal behavior and brain function. However, the simple forms of RL considered in most empirical research do not scale well, making their relevance to complex, real-world behavior unclear. In computational RL, one strategy for addressing the scaling problem is to introduce hierarchical st...

Journal:

CoRR

Volume: abs/1711.09874

Publication date: 2017
